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Abstract: A numerical method for variable coefficient elliptic problems on two di¬ 
mensional domains is described. The method is based on high-order spectral approx¬ 
imations and is designed for problems with smooth solutions. The resulting system 
of linear equations is solved using a direct solver with complexity for the 

pre-computation and 0{N log N) complexity for the solve. The fact that the solver 
is direct is a principal feature of the scheme, and makes it particularly well suited 
to solving problems for which iterative solvers struggle; in particular for problems 
with highly oscillatory solutions. This note is intended as a tutorial description of the 
scheme, and draws heavily on previously published material. 

1. Introduction 


( 1 ) 


1.1. Problem formulation and outline of solution strategy. This note describes a direct solver 
for elliptic PDEs with variable coefficients, such as, e.g., 

{ [Au]{x) = g{x), X € D, 

u{x) = f{x), xeT, 

where A is a variable coefficient elliptic differential operator 

(2) [Au]{x) = -cii{x)[dlu]{x) - 2ci2{x)[did2u]{x) - C 22 {x)[dlu]{x) 

+ Ci{x)[diu]{x) -b C2{x)[d2u]{x) + c(x) u{x), 

where D is a box in the plane with boundary P = dD, where all coefficient functions (c, Cj, Cij) are 
smooth, and where / and g are given functions. (For generalizations, see Section 1.4 ) The solver is 
structured as follows: 


(1) The domain is first tessellated into a hierarchical tree of patches. For each patch on the 
hnest level, a reduced model that we call a “proxy” that represents its internal structure is 
computed. The proxy takes the form of a dense matrix (that may be stored in a data sparse 
format), and is for the small patches computed by brute force. 

(2) The larger patches are processed in an upwards pass through the tree. For each patch, its 
proxy matrix is formed by merging the proxies of its children. 

(3) Once the proxies for all patches have been computed, a solution to the PDF can be computed 
via a downwards pass through the tree. This step is typically very fast. 

We observe that this pattern is similar to the classical nested dissection method of George [5], with 
a large subsequent literature on “multifrontal solver,” see, e.g., HE] and the references therein. 

The techniques described in this note are drawn from [121 HE] E] [6|, which in turn is inspired by 
earlier work on multidomain spectral methods, see, e.g., mm and the references therein. 


1.2. Dirichlet-to-Neumann maps. In this section, the internal structure of a patch is represented 
by computing its Dirichlet-to-Neumann, or “DtN,” map. To explain what this map does, first observe 
that for a given boundary function /, the BVP Q typically has a unique solution (j) (unless the 
operator A happens to have a non-trivial null-space, see Remark [^. Now simply form the boundary 
function h that gives the normal derivative of the solution, 

h{x) = (t)n{x), 

1 


X €T 



where (l)n is the outwards pointing normal derivative. The process for constructing the function h 
from / is linear, and we write it as 

h = Tf. 


Or, equivalently. 


T : (^|r (j)n\r. 


where cf) satisfies A([) = 0. 


In general, the map T is a slightly unpleasant object; it behaves as a differentiation operator, and 
it has complicated singular behavior near the corners of T. A key observation is that in the present 
context, all these difficulties can be ignored since we limit attention to functions that are smooth. In 
a sense, we only need to accurately represent the projection of the “true” operator T onto a space 
of smooth functions (that in particular do not have any corner singularities). 

We represent boundary functions by tabulating their values at interpolation nodes on the edges of 
the boxes. For instance, for a leaf box, we place q Gaussian nodes on each side (for say q = 20), 
which means that the functions / and h are represented by vectors f, h G and the discrete 
approximation to T is a 4(7 x Aq matrix T. The technique for computing T for a leaf box is described 
in Section For a parent box r with children a and /3, there is a technique for computing from 
the matrices T“ and that is described in Section In essence, the idea is simply to enforce 
continuity of both potentials and fluxes across the edge that is shared by Oq, and 

Remark 1. For a general BVP like 0, the DtN operator need not exist. For example, suppose 
—k"^ is an eigenvalue of A with zero Dirichlet data. Then the operator Au = Au + kfu clearly has a 
non-trivial null-space. However, this situation is in some sense “unusual” (since the spectrum of an 
elliptic PDO on a bounded domain typically is discrete), and it turns out that one can for the most 
part completely ignore this complication and assume that the DtN operator always exists. Note for 
instance that if A is coercive (e.g. if A = —A), then the DtN is guaranteed to exist for any bounded 
domain. For problems for which resonances do present numerical problems, there is a variation of 
the proposed method that is rock-solid stable. The idea is to build a hierarchy of so called impedance 
maps (instead of DtN maps). These are cousins of the DtN that are defined by 

R : {4> + i<?in)|r — i<('n)|r) where satisfies Afi = 0. 

The map R always exists, and is moreover a unitary operator. This construction was proposed by 
Alex Barnett of Dartmouth College [7|. 


1.3. Complexity of the direct solver. The asymptotic complexity of the solver described in this 
note is exactly the same as that for classical nested dissection [5|. For a domain with N interior 
discretization nodes, the pre-computation (the upwards pass) costs 0{N^-^) operations, and then the 
solve (the downwards pass) costs 0{NlogN) operations, see [121 Sec. 5.2]. 

Optimal 0{N) complexity can be achieved for both the pre-computation and the solve stage when 
the Green’s function of the BVP is non-oscillatory. In this case, the matrices T” that approximate the 
DtN operators have enough internal structure that they can be well represented using a data-sparse 
format such as, e.g., the Ff-matrix framework of Hackbusch and co-workers [alEllIlia, see [6]. 


1.4. Generalizations. For notational simpliticy, this note treats only the simple Dirichlet problem 
0 . The scheme can with trivial modihcations be applied to more general elliptic operators coupled 
with Dirichlet, Neumann, or mixed boundary data. It has for instance been successfully tested on 
convection-diffusion problems that are strongly dominated (by a factor of 10“^) by the convection 
term. It has also been tested on vibration problems modeled by a variable coefficient Helmholtz 
equation, and has proven capable of solving problems on domains of size 200 x 200 wavelengths or 
more on an office laptop computer. The solutions are computed to seven correct digits. Extension 
to more general domains is done via parameter maps to a rectangle, or union of rectangles. See m 
for details. 
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Figure 1. The square domain 0 is split into 4x4 leaf boxes. These are then gathered 


into a binary tree of successively larger boxes as described in Section 5.1 One possible 


enumeration of the boxes in the tree is shown, but note that the only restriction is 
that if box r is the parent of box a, then t < a. 


1.5. Outline. To keep the presentation uncluttered, we start by describing a direct solver for 0 
for the case of no body-load, 5 = 0, and with 0 = [0,1]^ the unit square. Section introduces the 
discretization, and how we represent the approximation to the solution u for Q. Section describes 
the how to compute the DtN operator for a leaf in the tree. Section describes the merge process 
for how to take the DtN operators for two touching boxes, and computing the DtN operator for their 
union. Section describes the full hierarchical scheme. Section describes how to solve a problem 
with body loads. 


2. Discretization 


Partition the domain D into a collection of square (or possibly rectangular) boxes, called leaf boxes. 
On the edges of each leaf, place q Gaussian interpolation points. The size of the leaf boxes, and the 
parameter q should be chosen so that any potential solution u of 0 , as well as its first and second 
derivatives, can be accurately interpolated from their values at these points {q = 20 is often a good 
choice). Let denote the collection of interpolation points on all boundaries. 

Next construct a binary tree on the collection of leaf boxes by hierarchically merging them, making 
sure that all boxes on the same level are roughly of the same size, cf. Figure The boxes should be 
ordered so that if r is a parent of a box a, then t < a. We also assume that the root of the tree 
(i.e. the full box D) has index r = 1. We let D,- denote the domain associated with box r. 

With each box r, we define two index vectors I[ and If as follows: 


If A list of all exterior nodes of r. In other words, k € If iS lies on the boundary of Hr- 
If For a parent r. If is a list of all its interior nodes that are not interior nodes of its children. 
For a leaf t, If is empty. 


Let u G denote a vector holding approximations to the values of u of in other words, 


u{k) u{xk). 


Finally, let v G denote a vector holding approximations to the boundary fluxes of the solution u 
of ([^, in other words 


yj{k) 


d 2 u{xk), when xj lies on a horizontal edge, 

diu{xk), when Xj lies on a vertical edge. 


Note the sign convention for the normal derivatives: we use a global frame of reference, as opposed 
to distinguishing between outwards and inwards pointing normal derivatives. This is a deliberate 
choice to avoid problems with signs when matching fluxes of touching boxes. 
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3. Constructing the Dirichlet-to-Neumann map for a leaf 


This section describes a spectral method for computing a discrete approximation to the DtN map 
T'^ associated with a leaf box fl,-. In other words, if r is a solution of Q, we seek a matrix of 
size 4g x Aq such that 

.3) <Il) ^ T" 

' Neumann data DtN map Dirichlet data 


Conceptually, we proceed as follows: Given a vector u(/J) G specifying the solution u on the 
boundary of fir, form for each side the unique polynomial of degree at most q — 1 that interpolates 
the q specified values of u. This yields Dirichlet boundary data on Dr in the form of four polynomials. 
Solve the restriction of 0 to Dr for the specified boundary data using a spectral method on a local 
tensor product grid oi p x p Chebyshev nodes (typically, we choose p = q + !)■ The vector v(Ig) is 
obtained by spectral differentiation of the local solution, and then re-tabulating the boundary fluxes 
to the Gaussian nodes in {ajfcjfce/j- 


We give details of the construction in Section 3.2 but as a preliminary step, we hrst review a classical 
spectral collocation method for the local solve in Section [3.1| 


Remark 2. Chebyshev nodes are ideal for the leaf computations, and it is in principle also possible 
to use Chebyshev nodes to represent all boundary-to-boundary “solution operators” such as, e.g., 
(indeed, this was the approach taken in the first implementation of the proposed method |13j ). 
However, there are at least two substantial benefits to using Gaussian nodes that justify the trouble to 
retabulate the operators. First, the procedure for merging boundary operators defined for neighboring 
boxes is much cleaner and involves less bookkeeping since the Gaussian nodes do not include the 
corner nodes. (Contrast Section f of |13] with Section w Second, and more importantly, the use of 
the Gaussian nodes allows for interpolation between different discretizations. Thus the method can 
easily be extended to have local refinement when necessary, see Remark^ 


3.1. Spectral discretization. Let D^ denote a rectangular subset of D with boundary T,-, and 
consider the local Dirichlet problem with the boundary data given by a “dummy” function tp 


( 4 ) 


[Au]{x) = 0, 
u{x) = 'p{x), 


X G Dr, 
X ^ T r, 


where the elliptic operator A is defined by Q. We will construct an approximate solution to Q 

using a classical spectral collocation method described in, e.g., Trefethen US]: First, pick a small 
2 

integer p and let {zkY^^^i denote the nodes in a tensor product grid oipxp Chebyshev nodes on D,-. 
Let and denote spectral differentiation matrices corresponding to the operators d/dxi and 
3(3x2, respectively. The operator 0 is then locally approximated via the x p^ matrix 

(5) A = -Cii(dW)^ - 2Ci2D(i)d(2) - C22{D^^^f + + CaD^^) + C, 


where Cn is the diagonal matrix with diagonal entries and the other matrices Cjj, Cj, 

C are defined analogously. 

Let w G denote a vector holding the desired approximate solution of 0. We populate all 
entries corresponding to boundary nodes with the Dirichlet data from if, and then enforce a spectral 
collocation condition at the interior nodes. To formalize, let us partition the index set 


{1, 2, ...,p2} = j^u Ji 


in such a way that Je contains the 4(p — 1) nodes on the boundary of Dr, and denotes the set of 
{p — interior nodes, see Figure 0a). Then partition the vector w into two parts corresponding to 
internal and exterior nodes via 

Wi=w(Ji), We = w(Je). 
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(a) (b) 

Figure 2. Notation for the leaf computation in Section]^ (a) A leaf before elimina¬ 
tion of interior (white) nodes, (b) A leaf after elimination of interior nodes. 

Analogously, partition A into four parts via 

Aiji = A(Ji, Aj^e — Je)^ ^e,i — ^{Jej ^e,e — A(Je) Je)- 
The potential at the exterior nodes is now given directly from the boundary condition: 

We = • 

For the internal nodes, we enforce the PDF (|^ via direct collocation: 

(6) Ai^i Wi Ai,e We = 0. 

Solving Q for Wi, we find 

(7) Wi = -Ar.l Ai^eWe, 

3.2. Constructing the approximate DtN. Now that we know how to approximately solve the 
local Dirichlet problem (Q via a local spectral method, we can build a matrix such that Q holds 
to high accuracy. The starting point is a vector u(/t-) G of tabulated potential values on the 
boundary of Qr- We will construct the vector \i{It) £ 1^^^ via four linear maps. The combination of 
these maps is the matrix T^. 

Step 1 — re-tabulation from Gaussian nodes to Chebyshev nodes: For each side of Qr, 
form the unique interpolating polynomial of degree at most q — 1 that interpolates the q potential 
values on that side specified by u(/J). Now evaluate these polynomials at the boundary nodes of 
a p X p Chebyshev grid on fl,-. Observe that for a corner node, we may in the general case get 
conflicts. For instance, the potential at the south-west corner may get one value from extrapolation 
of potential values on the south border, and one value from extrapolation of the potential values 
on the west border. We resolve such conflicts by assigning the corner node the average of the two 
possibly different values. (In practice, essentially no error occurs since we know that the vector u(/J) 
tabulates an underlying function that is continuous at the corner.) 

Step 2 — spectral solve: Step 1 populates the boundary nodes of the p x p Chebyshev grid with 
Dirichlet data. Now determine the potential at all interior points on the Chebyshev grid by executing 
a local spectral solve, cf. equation Q. 

Step 3 — spectral dijferentiation: After Step 2, the potential is known at all nodes on the 
local Chebyshev grid. Now perform spectral differentiation to evaluate approximations to duldx 2 
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Figure 3. Notation for the merge operation described in Section]^ The rectangular 
domain is formed by two squares and The sets Ji and J 2 form the exterior 
nodes (black), while J3 consists of the interior nodes (white). 

for the Chebyshev nodes on the two horizontal sides, and du/dxi for the Chebyshev nodes on the 
two vertical sides. 


Step 4 — re-tabulation from the Chebyshev nodes back to Gaussian nodes: After Step 3, 
the boundary fluxes on didr are specified by four polynomials of degree p—1 (specified via tabulation 
on the relevant Chebyshev nodes). Now simply evaluate these polynomials at the Gaussian nodes on 
each side to obtain the vector v(/J). 

Putting everything together, we find that the matrix is given as a product of four matrices 

= U o L3 o L2 o Li 

4g X 4q 4q x 4p 4p x p^ p^ x 4{p — 1) 4{p — 1) x 4q 

where Lj is the linear transform corresponding to “Step i” above. Observe that many of these 
transforms are far from dense, for instance, Li and L 4 are 4x4 block matrices with all off-diagonal 
blocks equal to zero. Exploiting these structures substantially accelerates the computation. 


4. Merging two DtN maps 


Let T denote a box in the tree with children a and /3. In this section, we demonstrate that if the 
DtN matrices T“ and for the children are known, then the DtN matrix can be constructed 
via a purely local computation which we refer to as a “merge” operation. 

We start by introducing some notation: Let Hr denote a box with children Dq, and For concrete¬ 
ness, let us assume that Dq, and Up share a vertical edge as shown in Figure]^ so that 

Hr — U Up * 

We partition the points on dila and dHp into three sets: 


Ji Boundary nodes of ila that are not boundary nodes of ilp. 

J 2 Boundary nodes of that are not boundary nodes of Dq. 

J 3 Boundary nodes of both and Up that are not boundary nodes of the union box D,-. 


Figure [^illustrates the definitions of the J^’s. Let u denote a solution to Q, with tabulated potential 
values u and boundary fluxes v, as described in Section Set 


and Ue 


Ul 

U2 


( 8 ) 


Ui = U3, 
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Recall that T“ and denote the operators that map values of the potential u on the boundary to 
values of dnU on the boundaries of the boxes flo, and fl/j, as described in Section]^ The operators 
can be partitioned according to the numbering of nodes in Figure resulting in the equations 


(9) 


Vi 


r T“ 

' 1,1 

T“ 1 

' 1,3 


Ul 

''3 


T“ 

L ' 3,1 

-ra 

' 3,3 J 


“3 


and 


V2 


' 2,2 

' 2,3 


U2 

V3 


' 3,2 

' 3,3 


"3 


Our objective is now to construct a solution operator and a DtN matrix such that 
(10) U3 = 


Ui 

U2 


( 11 ) 


Vi 

V2 


Ui 

»2 


To this end, we eliminate V 3 from ([^ and write the result as a single equation: 



T" 

' 1,1 

0 

T“ 

' 1,3 

(12) 

0 

' 2,2 

1- 


T" 

3,1 

1- 

1 

T“ — 

' 3,3 ' 3,3 


ui 

U2 


U 3 


Vi 

V2 


The last equation directly tells us that (0 holds with 
(13) = (T 3 , 3 -TQ"^[-T 

By eliminating U 3 from (12) by forming a Schur complement, we also find that (0 holds with 


a 

3,1 


' 3,21 


(14) 


' 1,1 

0 


0 


2,2 J 


+ 


1,3 


2,3 J 




a. 

3,1 


' 3,2j 


5. The full hierarchical scheme 

At this point, we know how to construct the DtN operator for a leaf (Section]^, and how to merge 
two such operators of neighboring patches to form the DtN operator of their union (Section]^. We 
are ready to describe the full hierarchical scheme for solving the Dirichlet problem ([^. This scheme 
takes the Dirichlet boundary data /, and constructs an approximation to the solution u. The output 
is a vector u that tabulates approximations to u at the Gaussian nodes {xk}k=i interior edges 

that were defined in Section To hnd u at an arbitrary set of target points in D, a post-processing 
step described in Section |5.3| can be used. 

5.1. The algorithm. Partition the domain into a hierarchical tree as described in Section Then 
execute a “build stage” in which we construct for each box r the following two matrices: 

S'^ For a parent box r, is a solution operator that maps values of u on d^lr to values of u at 
the interior nodes. In other words, u{I[) = S'” u(IJ). (For a leaf r, S” is not defined.) 

T”” The matrix that maps u(/J) (tabulating values of u on dQr) to v(/J) (tabulating values of 
du/dn). In other words, v(/J) = T”” u(/g ). 

(Recall that the index vectors and I[ were defined in Section]^) The build stage consists of a 
single sweep over all nodes in the tree. Any bottom-up ordering in which any parent box is processed 
after its children can be used. For each leaf box r, an approximation to the local DtN map T^ is 
constructed using the procedure described in Section For a parent box r with children a and 13, 
the matrices and T"” are formed from the DtN operators T“ and via the process described in 
Section Algorithm 1 summarizes the build stage. 
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Algorithm 1 (build solution operators) 

This algorithm builds the global Dirichlet-to-Neumann operator for Q. 

It also builds all matrices required for constructing u at any interior point. 
It is assumed that if node r is a parent of node a, then t < a. 


for T — A^boxes) A^boxes 1; A^boxes 2, . . . , 1 
if (r is a leaf) 

Construct via the process described in Section 

else 

Let a and /3 be the children of r. 

Split /" and if into vectors Ii, I 2 , and I 3 as shown in Figure |3 


( 1 ) 

( 2 ) 

(3) 

(4) 

(5) 

( 6 ) 

(7) 

( 8 ) 

(9) 

( 10 ) 

(11) end for 


3,3 


l 

Delete T" and T^. 

end if 


tC3)"'[-T?,i I tI,] 


r T“ 

' 1,1 

0 

+ 


0 

' 2,2 J 

L ' 2,3 J 


Algorithm 2 (solve BVP once solution operator has been built) 

This program constructs an approximation u to the solution u of (§. 

It assumes that all matrices have already been constructed in a pre-computation. 
It is assumed that if node r is a parent of node a, then t < a. 


(1) u(A:) = f { xk ) for all k G 

( 2 ) for T = 1, 2, 3, ..., Aboxes 

(3) if (r is a parent) 

(4) u(/n = S"u(/J). 

(5) end if 

(6) end for 


Remark: This algorithm outputs the solution on the Gaussian nodes on box bound¬ 
aries. To get the solution at other points, use the method described in Section 5.3. 


Once all the matrices {S’^jr have been formed, a vector u holding approximations to the solution 
u of Q can be constructed for all discretization points by starting at the root box D and moving 
down the tree toward the leaf boxes. The values of u for the points on the boundary of D can be 
obtained by tabulating the boundary function /. When any box r is processed, the value of u is 
known for all nodes on its boundary (i.e. those listed in JJ). The matrix directly maps these 
values to the values of u on the nodes in the interior of r (i.e. those listed in I^). When all nodes 
have been processed, approximations to u have constructed for all tabulation nodes on interior edges. 
Algorithm 2 summarizes the solve stage. 

Remark 3. The merge stage is exact when performed in exact arithmetic. The only approximation 
involved is the approximation of the solution u on a leaf by its interpolating polynomial. 

Remark 4. To keep the presentation simple, we consider in this note only the case of a uniform 
computational grid. Such grids are obviously not well suited to situations where the regularity of 
the solution changes across the domain. The method described can in principle be modified to handle 
locally refined grids quite easily. A complication is that the tabulation nodes for two touching boxes will 
typically not coincide, which requires the introduction of specialized interpolation operators. Efficient 














refinement strategies also require the development of error indicators that identify the regions where 
the grid need to be refined. This is work in progress, and will be reported at a later date. We observe 
that our introduction of Gaussian nodes on the internal boundaries (as opposed to the Chebyshev 
nodes used in m) makes re-interpolation much easier. 


5.2. Asymptotic complexity. In this section, we determine the asymptotic complexity of the 
direct solver. Let A^ieaf = denote the number of Gaussian nodes on the boundary of a leaf box, 
and let denote the number of Chebychev nodes used in the leaf computation. Let L denote the 
number of levels in the binary tree. This means there are 4^ boxes. Thus the total number of 
discretization nodes N is approximately A^q = . (To be exact, N = + 2^~^^q.) 

The cost to process one leaf is approximately 0{q^). Since there are ^ leaf boxes, the total cost of 
pre-computing approximate DtN operators for all the bottom level is ^ x q^ ^ Nq^. 


Next, consider the cost of constructing the DtN map on level i via the merge operation described in 
Section]^ For each box on the level i, the operators and are constructed via (13) and (13). 
These operations involve matrices of size roughly 2 ~^N^'^ x 2 ~^N^'^. Since there are 4^ boxes per 
level. The cost on level ^ of the merge is 


4^^ X 


2-£^o.5 


2~^N^-^. 


The total cost for all the merge procedures has complexity 

£=1 


Finally, consider the cost of the downwards sweep which solves for the interior unknowns. For any 
non-leaf box r on level £, the size of is 2 ^q x 2 \6q) which is approximately 2 -e^o.5 

X 2' -^AO-5. 

Thus the cost of applying is roughly (2~^N^'^)‘^ = 2 ~‘^^N. So the total cost of the solve step has 
complexity 

^22^2“2^A ~ A log A. 

z=o 


In [6], we explain how to exploit structure in the matrices T and S to improve the computational 
cost of both the precomputation and the solve steps. 


5.3. Post-processing. The direct solver in Algorithm 1 constructs approximations to the solution 
rt of ([^ at tabulation nodes at all interior edges. Once these are available, it is easy to construct 
an approximation to u at an arbitrary point. To illustrate the process, suppose that we seek an 
approximation to u{y), where y is a point located in a leaf r. We have values of u tabulated 
at Gaussian nodes on 80.^- These can easily be re-interpolated to the Chebyshev nodes on dklr. 
Then u can be reconstructed at the interior Chebyshev nodes via the formula ([^; observe that 
the local solution operator — AjTj^Ai^e was built when the leaf was originally processed and can be 
simply retrieved from memory (assuming enough memory is available). Once u is tabulated at the 
Chebyshev grid on klr, it is trivial to interpolate it to y or any other point. 
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6. Body loads 


Now that we have described how to solve our basic boundary value problem 


(15) 


[Au]{x) 

u{x) 


9 { x ), 

fix), 


X € Q, 

X €r, 


for the special case where g = 0, we will next consider the more general case that includes a body 
load. Only minor modifications are required to the basic scheme, but note that the resulting method 
requires substantially more memory. The techniques presented here were developed jointly with 
Tracy Babb of CU-Boulder, cf. [TO] . 


6.1. Notation. When handling body loads, we will extensively switch between the Chebyshev and 
Gaussian grids, so we need to introduce some additional notation. 

Let {yj}^i denote the global grid obtained by putting down apxp tensor product grid of Chebyshev 
nodes on each leaf. For a leaf r, let denote an index vector pointing to the nodes in {yj}jLi that 
lie on leaf r. We partition this index vector into exterior and interior nodes as follows 

r = r u F- 

To avoid confusion with the index vectors pointing into the grid of Gaussian boundary functions, we 
rename these index vectors as follows: 

/ggi An index vector marking the (Gauss) nodes in that lie on d^r- 

/gj: For a parent node r, this is an index vector marking the (Gauss) nodes in {xi}f^^ that lie 
on the “interior” boundary of r. For a leaf node, this vector is not defined. 


The fact that we use two spectral grids also leads to a need to distinguish between different vectors 
tabulating approximate values. We have: 



r is a leaf 

r is a parent 

Uc 

u tabulated on Chebyshev nodes 

— 

Ke 

u tabulated on Chebyshev exterior nodes 

— 

Uci 

u tabulated on Chebyshev interior nodes 

— 

^ge 

U^i 

u tabulated on Gaussian exterior nodes 

u tabulated on Gaussian exterior nodes 
u tabulated on Gaussian interior nodes 


6.2. Leaf computation. Let r be a leaf. We split the solution u to the equation 

f Auix) = qix), X G CLr, 


u{x) = V'(®), 


as 


u = U) + I 


where tc is a particular solution 


(17) 


Aw{x) = g{x), 
w{x) = 0, 


and where (j) \s a homogeneous solution 


A(f){x) = 0, 
(i){x) = tpix), 


X € Qr, 
X G F^, 

X G Ot, 
X G F.^. 


We can now write the Neumann data for u as 

dnu\r^ = dnw\r^ + dn4>\r^ = dnw\r^ + T(p\r^ = dnw\r^ + F V', 
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where, as before, T is the NfD operator. Our objective is therefore to hnd a matrix that maps the 
given body load g on to the Neumann data of w. We do this calculation on the Chebyshev grid. 
Discretizing and collocating on the internal nodes, we find 

Aci,ceWce “1“ Aci^ciWci = g^.;. 

But now observe that Wce = 0, so the particular solution is given by 


Wr 


Wo 

Wo 


— Fc,cigci5 


where 


0 

■ 

Cl,Cl J 


Let hge denote the Neumann data for w, tabulated at the Gaussian nodes on the boundary. It follows 
that 

hge = Dge^cFc,ci Sci) 

'-V-' 

—. Hg0 ci 

where Dge^c is the operator that differentiates from the full Chebyshev grid, and then interpolates to 
the Gaussian exterior nodes. (Using the notation from Section 0 Dge,c — L4L3.) 

Once the boundary data of u on is given, the total solution on the Chebyshev grid is given by 

Uc — Sc^geUge + ^CjCiSci — Sc^geLlge + W^, 


where Sc ge is the solution operator given by 


'c,ge 


I 

Cl,Cl 


ce,ge? 


where, in turn, lce,ge is the interpolation operator that maps from the boundary Gauss nodes to the 
boundary Chebyshev nodes. 


6.3. Merge operator. Let r be a parent node with children a and /?. The local equilibrium equa¬ 
tions read 


Vi 


''3 

- 

V2 


. ''3 



"YO. "YOL 

* 1,1 * 1,3 

-pet -pet 

* 3,1 ■ 3,3 


' 2,2 

I Q 


' 2,3 

I Q 



Ul 

- 

“3 


U2 


"3 


+ 


+ 


h? 


(19) 

( 20 ) 

^ 3,2 ' 3,3 

Combine the two equations for V 3 to obtain the equation 

ui + U3 + = 7^2 U2 + Tf 3 U3 + hj. 

This gives [check for sign errors!] 

(21) U3 = (T“3 - Tj3)-'(Tj2U2 - T“iUi + hf - h“) 


Inserting (21) back into (19) we find 


Vl 

- (\ 

''2 

u 


1,1 


0 


0 


+ 


^2 J 


+ 


2,2 J 

T" 

' 1,3 

' 2,3 J 


1.3 

2.3 J 




3,2J 




Hi 

H2 


+ 
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We now define operators 

K.,i = (T",3 - Tg,s) 

Sk,.= {TJ,3-T^,3)-'[-T: 


r'. 

,-l [ 


3,1 


<2l = XJ,,g,[-ni 


ge,ge 


' 1,1 

0 


0 

' 2,2 


(T?,3-Ty 


-P 1 
3,2J> 

T 


3,2J 


1,1 

0 


2,2 


'gi,ge- 


Then in the upwards pass in the solve, we compute 

w 


gi 


x'K-hj) 


■ge 


T“ 

' 1,3 

' 2,3 J 


W 


go 


and in the solve stage, we recover Ugj via 


gi 


^gi,ge^ge + ''''gi- 


( 22 ) 


Remark 5 (Physical interpretation of merge). The quantities Wgj and hgg have a simple physical 
meaning. The vector Wgj introduced above is simply a tabulation of the particular solution w'^ asso¬ 
ciated with T on the interior boundary Pa, and hgg is the normal derivative of w'^. To be precise, w'^ 
is the solution to the inhomogeneous problem, cf. 0 

Aw'^{x) = g{x), X G Tlr, 

w'^{x) = 0, ® G r,-. 

We can re-derive the formula for ic|r 3 using the original mathematical operators as follows: First 
observe that for x G n", we have A{w'^ — w^) = g — g = 0, so the NfD operator T“ applies to the 

fimcfinTi 1!)'^ — ’ 

Ti,(w\ - O + - O = - (a„«,“)|3 

Use that wl = wf = = 0, and that {dnW ^)\3 = hf to get 

(23) = {dnw^)\3 - hf. 

Analogously, we get 

(24) T^^wl = {dnw'^)\3 - h§. 

Combine (23) and (24) to eliminate (clnin ’^)|3 and obtain 

{T--Ti,)wl = -h^ + hl 

Observe that in effect, we can write the particular solution w'^ as 

w°‘{x)-\-w'^{x) a; G 12“, 


w'^(x) = 


w^{x) + w'^{x) X G 12^, 


The function w'^ must of course be smooth across Pa, so the function w'^ must have a jump that exactly 
offsets the discrepancy in the derivatives of tc" and . This jump is precisely of size h°‘ — h^. 
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Algorithm 3 (Build stage for problems with body load) 

This algorithms build all solution operators required to solve the non- 


homogeneous BVP (16). It is assumed that if node r is a parent of node a, 
then T < a. 

for T — A^boxes) A^boxes 1; A^boxes 2, . . . , 1 
if (r is a leaf) 


0 


H 


ge,ci 


c,ge 


ge,ge 


D, 


-1 


Cl,Cl 

!,c' c,ci 


[pot.] -It- [body load] 
[deriv.] •(— [body load] 


Cl,Cl 


ce,ge 


[pot] ^ [pot] 


Dge,cSc,ge [deriv.] •(— [pot.] (NfD operator) 


else 


Let a and /3 be the children of r. 

Partition I" and into vectors Ii, I 2 , and I 3 as shown in Figure [s 
>^gi,gi = (T 3,3 - ^^ 3 )"^ [pot.] ^ [deriv.] 

Sgi,ge = >^gi,gi [-T?,i I TJ 2 ] [pot] ^ [pot] 


ge,ge 


r T" 

' 1,1 

0 

+ 

H 

CO 

0 

' 2,2 J 

1 ' 2,3 J 


u 


gbge 


[deriv.] •(— [pot.] (NfD operator). 


end if 
end for 


Figure 4. Build stage. 
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Algorithm 4 (Solver for problems with body load) 

This program constructs an approximation u to the solution u of Q. It assumes 
that all matrices required to represent the solution operator have already been 
constructed using Algorithm 3. It is assumed that if node r is a parent of node 
u, then T < a. 


Upwards pass — construct all particular solutions: 

for T A^boxes) A^boxes 1; A^boxes 2, • . • , 1 
if (r is a leaf) 

# Compute the derivatives of the local particular solution. 


else 


■ge 


H 


ge,ci Sci 


Let a and j3 denote the children of r. 

# Compute the local particular solution. 


w 


+ ^’s)- 


gi ^gbgi V ^3 
# Compute the derivatives of the local particular solution. 


"ge 

end if 
end for 


"1 


2 J 


+ 


T“ 

' 1,3 

' 2,3 J 


W 


gi' 


Downwards pass — construct all potentials: 

ff Use the provided Dirichlet data to set the solution on the exterior of the root. 
u{k) = f{xk) for all k £ I^. 
for T 1, 2, 3, A^boxes 
if (r is a parent) 

# Add the homogeneous term and the particular term. 


else 


gi 




+ W 


go 


# Add the homogeneous term and the particular term. 

“c '^c,ge ^ge ' ’ c,ci oci 

end 
end for 


Figure 5. Solve stage. 
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Appendix A. A graphical illustration of the algorithm 


This se ction provides an illustrated overview of the hierarchical merge process described in detail in 
Section 5.1, The figures illustrate a situation in which a square domain = [0,1]^ is split into 4x4 


leaf boxes on the finest level, and a 6 x 6 spectral grid is used in each leaf. 


Step 1: Partition the box H into 16 small boxes that each holds an 6 x 6 Cartesian mesh of Chebyshev 
nodes. For each box, identify the internal nodes (marked in blue) and eliminate them as described 
in Section Construct the solution operator U, and the DtN operators encoded in the matrices V 
and W. 


• • • • • • 

I • • • • < 

»••••« 

»••••« 

• • • • • i 

I • • • • « 

»••••« 

» • • • • i 

I • • • • i 

»••••« 

»••••« 

» • • • • i 

B • • • • fl 

§••••« 

»••••« 

• • • • • 

I • • • • • 

»••••« 

»••••« 

• • • • • • 

1 • • • • « 

»••••« 

B • • • • 4 

» • • • • • 

I • • • • fl 

»••••« 

»••••« 

» • • • • • 

I • • • • fl 

B • • • • 4 

B • • • • 4 



Step 2: Switch tabulation points on the boundary from Chebyshev nodes to Legendre nodes. The 
main purpose is to remove the corner nodes. 




Step 3: Merge the small boxes by pairs as described in Section]^ The equilibrium equation for each 
rectangle is formed using the DtN operators of the two small squares it is made up of. The result 
is to eliminate the interior nodes (marked in blue) of the newly formed larger boxes. Construct the 
solution operator U and the DtN matrices V and W for the new boxes. 
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Step 4: Merge the boxes created in Step 3 in pairs, again via the process described in Section 



Step 3 


Step 5: Repeat the merge process once more. 



Step 4 
=> 


Step 6: Repeat the merge process one final time to obtain the DtN operator for the boundary of 
the whole domain. 


16 










































Step 5 


References 

[1] Mario Bebendorf, Hierarchical matrices, Lecture Notes in Computational Science and Engineering, vol. 63, 
Springer-Verlag, Berlin, 2008, A means to efficiently solve elliptic boundary value problems. MR 2451321 
(2009k:15001) 

[2] Steffen Borm, Ejficient numerical methods for non-local operators, EMS Tracts in Mathematics, vol. 14, European 
Mathematical Society (EMS), Zurich, 2010, "H^-matrix compression, algorithms and analysis. MR 2767920 

[3] Timothy A Davis, Direct methods for sparse linear systems, vol. 2, Siam, 2006. 

[4] I.S. Duff, A.M. Erisman, and J.K. Reid, Direct methods for sparse matrices, Oxford, 1989. 

[5] A. George, Nested dissection of a regular finite element mesh, SIAM J. on Numerical Analysis 10 (1973), 345-363. 

[6] A. Gillman and P. Martinsson, A direct solver with o(n) complexity for variable coefficient elliptic pdes discretized 
via a high-order composite spectral collocation method, SIAM Journal on Scientific Computing 36 (2014), no. 4, 
A2023-A2046, arXiv.org report ^1307.2665. 

[7] Adrianna Gillman, AlexH. Barnett, and Per-Gunnar Martinsson, A spectrally accurate direct solution technique for 
frequency-domain scattering problems with variable media, BIT Numerical Mathematics 55 (2015), no. 1, 141-170 
(English). 

[8] W. Hackbusch, B. Khoromskij, and S. Sauter, On T-L^-matrices. Lectures on Applied Mathematics, Springer Berlin, 
2002, pp. 9-29. 

[9] Wolfgang Hackbusch, A sparse matrix arithmetic based on H-matrices; Part I: Introduction to H-matrices, Com¬ 
puting 62 (1999), 89-108. 

[10] T.S. Haul, T. Babb, P.G. Martinsson, and B.A. Wingate, A high-order scheme for solving wave propagation 
problems via the direct construction of an approximate time-evolution operator, arXiv preprint arXiv: 1402.5168 
(2014). 

[11] J.S. Hesthaven, P.G. Dinesen, and J.P. Lynov, Spectral collocation time-domain modeling of diffractive optical 
elements. Journal of Computational Physics 155 (1999), no. 2, 287 - 306. 

[12] P.C. Martinsson, A composite spectral scheme for variable coefficient helmholtz problems, arXiv preprint 
arXiv:1206.4136 (2012). 

[13] P.C. Martinsson, A direct solver for variable coefficient elliptic pdes discretized via a composite spectral collocation 
method. Journal of Computational Physics 242 (2013), no. 0, 460 - 479. 

[14] H.P. Pfeiffer, L.E. Kidder, M.A. Scheel, and S.A. Teukolsky, A multidomain spectral method for solving elliptic 
equations. Computer physics communications 152 (2003), no. 3, 253-273. 

[15] L.N. Trefethen, Spectral methods in matlab, SIAM, Philadelphia, 2000. 


17 





